Implementation and performance analysis of efficient index structures for DNA search algorithms in parallel platforms

Authors

  • Nuno Sebastião
  • Gustavo Encarnação
  • Nuno Roma
Abstract

Because of the large datasets that are usually involved in deoxyribonucleic acid (DNA) sequence alignment, the use of optimal local alignment algorithms (e.g., Smith–Waterman) is often infeasible in practical applications. As such, more efficient solutions that rely on indexed search procedures are often preferred, because they significantly reduce the time needed to obtain such alignments. The data structures usually adopted to build such indexes include suffix trees, suffix arrays, and hash tables of q-mers. This paper presents a comparative analysis of highly optimized parallel implementations of index-based search algorithms using these three data structures, considering two different parallel platforms: a homogeneous multi-core central processing unit (CPU) and an NVIDIA Fermi graphics processing unit (GPU). In contrast to the CPU implementations, the experimental results reveal that GPU implementations clearly favor suffix arrays, because of the performance achieved in terms of memory accesses. Furthermore, the results also reveal that both suffix trees and suffix arrays outperform hash tables of q-mers when dealing with the largest datasets. When compared with a quad-core CPU, the results demonstrate that speedups as high as 65 can be achieved with the GPU when using a suffix-array index, thus making it an adequate choice for high-performance bioinformatics applications. Copyright © 2012 John Wiley & Sons, Ltd.
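To make the index structures named in the abstract concrete, the following is a minimal, sequential Python sketch of exact-match lookup with a suffix array and with a hash table of q-mers. It is an illustration only, not the authors' optimized parallel CPU or GPU implementation; the function names, the naive index construction, and the toy sequence are all assumptions introduced here.

```python
from collections import defaultdict


def build_suffix_array(text):
    """Start positions of all suffixes in lexicographic order.

    Naive construction (sorts full suffixes); fine for a toy example only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def search_suffix_array(text, sa, pattern):
    """Return sorted start positions of pattern, using two binary searches."""
    m = len(pattern)
    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def build_qmer_index(text, q):
    """Map every length-q substring (q-mer) to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def search_qmer_index(text, index, q, pattern):
    """Look up the first q-mer of pattern, then verify each candidate position."""
    if len(pattern) < q:
        raise ValueError("pattern must be at least q characters long")
    return [i for i in index.get(pattern[:q], []) if text.startswith(pattern, i)]


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # hypothetical toy reference sequence
    sa = build_suffix_array(reference)
    qmers = build_qmer_index(reference, 3)
    print(search_suffix_array(reference, sa, "TTA"))
    print(search_qmer_index(reference, qmers, 3, "ATTACA"))
```

In this sketch the suffix array is a flat list of integers probed by binary search, whereas the q-mer table needs a hash lookup followed by verification of each candidate position.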

Similar resources

AN EFFICIENT OPTIMIZATION PROCEDURE BASED ON CUCKOO SEARCH ALGORITHM FOR PRACTICAL DESIGN OF STEEL STRUCTURES

Different kinds of meta-heuristic algorithms have recently been used to overcome the complex nature of the optimum design of structures. In this paper, an integrated optimization procedure with the objective of minimizing the self-weight of real-size structures is performed by interfacing the SAP2000 and MATLAB® software packages in the form of parallel computing. The meta-heuristic algorithm chosen he...

Index Structures for Distributed Text Databases

The Web has become a ubiquitous resource for distributed computing, making it relevant to investigate new ways of providing efficient access to services available at dedicated sites. Efficiency is an ever-increasing demand that can only be satisfied by developing parallel algorithms that are efficient in practice. This tutorial paper focuses on the design, analysis and implementatio...

High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation

Image segmentation is one of the most common steps in digital image processing. There are many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...

Implementation of direction-of-arrival estimation algorithms by means of GPU-parallel processing in the CUDA environment (Research Article)

Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC) method, are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...

Journal title:
  • Concurrency and Computation: Practice and Experience

Volume 27


Publication date: 2015